Representation Of Examples
Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
Learning useful representations is a key ingredient to the success of modern machine learning. Currently, representation learning mostly relies on embedding data into Euclidean space. However, recent work has shown that data in some domains is better modeled by non-euclidean metric spaces, and inappropriate geometry can result in inferior performance. In this paper, we aim to eliminate the inductive bias imposed by the embedding space geometry. Namely, we propose to map data into more general non-vector metric spaces: a weighted graph with a shortest path distance. By design, such graphs can model arbitrary geometry with a proper configuration of edges and weights. Our main contribution is PRODIGE: a method that learns a weighted graph representation of data end-to-end by gradient descent. Greater generality and fewer model assumptions make PRODIGE more powerful than existing embedding-based approaches. We confirm the superiority of our method via extensive experiments on a wide range of tasks, including classification, compression, and collaborative filtering.
Chromatic Feature Vectors for 2-Trees: Exact Formulas for Partition Enumeration with Network Applications
Allagan, J., Morgan, G., Langley, S., Lopez-Bonilla, R., Deriglazov, V.
We establish closed-form enumeration formulas for chromatic feature vectors of 2-trees under the bichromatic triangle constraint. These efficiently computable structural features derive from constrained graph colorings where each triangle uses exactly two colors, forbidding monochromatic and rainbow triangles, a constraint arising in distributed systems where components avoid complete concentration or isolation. For theta graphs Theta_n, we prove r_k(Theta_n) = S(n-2, k-1) for k >= 3 (Stirling numbers of the second kind) and r_2(Theta_n) = 2^(n-2) + 1, computable in O(n) time. For fan graphs Phi_n, we establish r_2(Phi_n) = F_{n+1} (Fibonacci numbers) and derive explicit formulas r_k(Phi_n) = sum_{t=k-1}^{n-1} a_{n-1,t} * S(t, k-1) with efficiently computable binomial coefficients, achieving O(n^2) computation per component. Unlike classical chromatic polynomials, which assign identical features to all n-vertex 2-trees, bichromatic constraints provide informative structural features. While not complete graph invariants, these features capture meaningful structural properties through connections to Fibonacci polynomials, Bell numbers, and independent set enumeration. Applications include Byzantine fault tolerance in hierarchical networks, VM allocation in cloud computing, and secret-sharing protocols in distributed cryptography.
- North America > United States > New York (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > North Carolina > Pasquotank County > Elizabeth City (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.68)
- Information Technology > Data Science > Data Mining > Feature Extraction (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.65)
Educational Cone Model in Embedding Vector Spaces
Human-annotated datasets with explicit difficulty ratings are essential in intelligent educational systems. Although embedding vector spaces are widely used to represent semantic closeness and are promising for analyzing text difficulty, the abundance of embedding methods creates a challenge in selecting the most suitable method. This study proposes the Educational Cone Model, which is a geometric framework based on the assumption that easier texts are less diverse (focusing on fundamental concepts), whereas harder texts are more diverse. This assumption leads to a cone-shaped distribution in the embedding space regardless of the embedding method used. The model frames the evaluation of embeddings as an optimization problem with the aim of detecting structured difficulty-based patterns. By designing specific loss functions, efficient closed-form solutions are derived that avoid costly computation. Empirical tests on real-world datasets validated the model's effectiveness and speed in identifying the embedding spaces that are best aligned with difficulty-annotated educational texts.
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.63)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Polynomial Neural Sheaf Diffusion: A Spectral Filtering Approach on Cellular Sheaves
Borgi, Alessio, Silvestri, Fabrizio, Liò, Pietro
Sheaf Neural Networks equip graph structures with a cellular sheaf: a geometric structure which assigns local vector spaces (stalks) and a linear learnable restriction/transport maps to nodes and edges, yielding an edge-aware inductive bias that handles heterophily and limits oversmoothing. However, common Neural Sheaf Diffusion implementations rely on SVD-based sheaf normalization and dense per-edge restriction maps, which scale with stalk dimension, require frequent Laplacian rebuilds, and yield brittle gradients. To address these limitations, we introduce Polynomial Neural Sheaf Diffusion (PolyNSD), a new sheaf diffusion approach whose propagation operator is a degree-K polynomial in a normalised sheaf Laplacian, evaluated via a stable three-term recurrence on a spectrally rescaled operator. This provides an explicit K-hop receptive field in a single layer (independently of the stalk dimension), with a trainable spectral response obtained as a convex mixture of K+1 orthogonal polynomial basis responses. PolyNSD enforces stability via convex mixtures, spectral rescaling, and residual/gated paths, reaching new state-of-the-art results on both homophilic and heterophilic benchmarks, inverting the Neural Sheaf Diffusion trend by obtaining these results with just diagonal restriction maps, decoupling performance from large stalk dimension, while reducing runtime and memory requirements.
- North America > United States > Wisconsin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)
Thompson Sampling for Multi-Objective Linear Contextual Bandit
Park, Somangchan, Ann, Heesang, Oh, Min-hwan
We study the multi-objective linear contextual bandit problem, where multiple possible conflicting objectives must be optimized simultaneously. We propose \texttt{MOL-TS}, the \textit{first} Thompson Sampling algorithm with Pareto regret guarantees for this problem. Unlike standard approaches that compute an empirical Pareto front each round, \texttt{MOL-TS} samples parameters across objectives and efficiently selects an arm from a novel \emph{effective Pareto front}, which accounts for repeated selections over time. Our analysis shows that \texttt{MOL-TS} achieves a worst-case Pareto regret bound of $\widetilde{O}(d^{3/2}\sqrt{T})$, where $d$ is the dimension of the feature vectors, $T$ is the total number of rounds, matching the best known order for randomized linear bandit algorithms for single objective. Empirical results confirm the benefits of our proposed approach, demonstrating improved regret minimization and strong multi-objective performance.
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- North America > United States > Utah (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.42)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.42)
Model-Agnostic Private Learning
Raef Bassily, Abhradeep Guha Thakurta, Om Dipakbhai Thakkar
We design differentially private learning algorithms that are agnostic to the learning model assuming access to a limited amount of unlabeled public data. First, we provide a new differentially private algorithm for answering a sequence of m online classification queries (given by a sequence of m unlabeled public feature vectors) based on a private training set. Our algorithm follows the paradigm of subsample-and-aggregate, in which any generic non-private learner is trained on disjoint subsets of the private training set, and then for each classification query, the votes of the resulting classifiers ensemble are aggregated in a differentially private fashion. Our private aggregation is based on a novel combination of the distance-to-instability framework [26], and the sparse-vector technique [15, 18]. We show that our algorithm makes a conservative use of the privacy budget. In particular, if the underlying non-private learner yields a classification error of at most α (0, 1), then our construction answers more queries, by at least a factor of 1/α in some cases, than what is implied by a straightforward application of the advanced composition theorem for differential privacy. Next, we apply the knowledge transfer technique to construct a private learner that outputs a classifier, which can be used to answer an unlimited number of queries. In the P AC model, we analyze our construction and prove upper bounds on the sample complexity for both the realizable and the non-realizable cases. Similar to non-private sample complexity, our bounds are completely characterized by the VC dimension of the concept class.
- North America > United States > Ohio (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- (3 more...)
- Education > Educational Setting > Online (0.69)
- Information Technology > Security & Privacy (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)